An Efficient Approach Generating Optimized Clusters for Theoretic Clustering Using Data Mining

Authors

  • Ankita G. Joshi
  • R. R. Shelke
Abstract

The aim of the data mining process is to extract information from a large data set and transform it into an understandable structure for further use. Data mining is the process of finding anomalies, patterns, and correlations within large data sets in order to predict outcomes. Using a broad range of techniques, this information can be used to increase revenues, cut costs, improve customer relationships, reduce risks, and more. Clustering is a main task of exploratory data analysis and data mining applications. Theoretic clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar, in some sense, to each other than to those in other groups (clusters). Centralizing all of the data at a single source is often impractical or impossible, so there is great demand for distributed data clustering algorithms. Distributed clustering explores the hidden structure of data collected or stored in geographically distributed nodes. Information theoretic measures take the whole distribution of cluster data into account to produce optimized clustering results. For this reason, we incorporate an information theoretic measure into the cost function of the distributed clustering. The proposed Diffused Kernel DMMI algorithm with association rule mining can achieve optimized cluster results on textual and numeric data in less time.

Keywords: data mining, theoretic clustering, clustering, mutual information, diffusion
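The abstract names mutual information as a keyword and refers to an information theoretic measure in the clustering cost function, without giving its form. As a rough illustration only, the following sketch computes the empirical mutual information between two discrete label sequences (for example, cluster assignments and some categorical attribute). The function name `mutual_information` and the example inputs are assumptions made here for illustration; this is not the Diffused Kernel DMMI algorithm and does not reproduce the paper's cost function.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in nats, of two label sequences.

    Illustrative helper (hypothetical, not taken from the paper): xs and ys
    are equal-length sequences of hashable labels.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    clusters = [0, 0, 1, 1]
    same = [0, 0, 1, 1]       # fully determined by the cluster label
    unrelated = [0, 1, 0, 1]  # statistically independent of the cluster label
    print(mutual_information(clusters, same))
    print(mutual_information(clusters, unrelated))
```

A higher value means the cluster label tells more about the other variable; identical labelings give the entropy of the labeling, and independent labelings give zero.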

Similar resources

A clustering approach for mineral potential mapping: A deposit-scale porphyry copper exploration targeting

This work describes a knowledge-guided clustering approach for mineral potential mapping (MPM), in which the optimum number of clusters is derived from a knowledge-driven methodology through a concentration-area (C-A) multifractal analysis. To implement the proposed approach, a case study at the North Narbaghi region in Saveh, Markazi province, Iran, was investigated to discover porphyry ...

Applying a decision support system for accident analysis by using data mining approach: A case study on one of the Iranian manufactures

Uncertain and stochastic states have always been taken into consideration in the fields of risk management and accidents, as in other fields of industrial engineering, and they make decision making difficult and complicated for managers when selecting corrective actions and control measures. In this research, huge data sets of the accidents of a manufacturing and industrial unit have been st...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...

Subjectively Interesting Alternative Clusters

We deploy a recently proposed framework for mining subjectively interesting patterns from data [3] to the problem of clustering, where patterns are clusters in the data. This framework outlines how subjective interestingness of patterns (here, clusters) can be quantified using sound information theoretic concepts. We demonstrate how it motivates a new objective function quantifying the interest...


Publication date: 2015